Delphi studies in social and health sciences—Recommendations for an interdisciplinary standardized reporting (DELPHISTAR). Results of a Delphi study

Background While different proposals exist for a guideline on reporting Delphi studies, none of them has yet established itself in the health and social sciences and across the range of Delphi variants. This seems critical because empirical studies demonstrate a diversity of modifications in the conduct of Delphi studies and sometimes even errors in the reporting. The aim of the present study is to close this gap and formulate a general reporting guideline.
Method In an international Delphi procedure, Delphi experts were surveyed online in three rounds to find consensus on a reporting guideline for Delphi studies in the health and social sciences. The respondents were selected via publications of Delphi studies. The preliminary reporting guideline, containing 65 items on five topics and presented for evaluation, had been developed based on a systematic review of the practice of Delphi studies and a systematic review of existing reporting guidelines for Delphi studies. Starting in the second Delphi round, the experts received feedback in the form of mean values, measures of dispersion, a summary of the open-ended responses and their own response in the previous round. The final draft of the reporting guideline contains the items on which at least 75% of the respondents agreed by assigning scale points 6 and 7 on a 7-point Likert scale.
Results 1,072 experts were invited to participate. A total of 91 experts completed the first Delphi round, 69 experts the second round, and 56 experts the third round. Of the 65 items in the first draft of the reporting guideline, consensus was ultimately reached for 38 items addressing the five topics: Title and Abstract (n = 3), Context (n = 7), Method (n = 20), Results (n = 4) and Discussion (n = 4). Items focusing on theoretical research and on dissemination were either rejected or remained subjects of dissent.
Discussion We assume a high level of acceptance and interdisciplinary suitability regarding the reporting guideline presented here and referred to as the "Delphi studies in social and health sciences–recommendations for an interdisciplinary standardized reporting" (DELPHISTAR). Use of this reporting guideline can substantially improve the ability to compare and evaluate Delphi studies.


Introduction
Internationally, Delphi studies have proven themselves in a variety of disciplines and fields of application. Analyses show a growing prevalence of this technique, especially in the contexts of medicine, science and technology, and the social sciences [1]. Delphi studies represent an important tool for analyzing potential future conditions [2,3]. Associated with this is the idea of collective intelligence, according to which the prognostic ability of a group of experts is better than that of a single expert [4]. In the context of health sciences research, Delphi studies are used in the medical and natural sciences [5] and the behavioral social sciences [6]. They are selected for use if little or inconsistent evidence is available [7], if primary studies are not possible for economic, ethical, or pragmatic reasons, or if there are practical challenges in clinical or nursing contexts.
Due to the prevalence of Delphi studies [1,8], different authors have already formulated proposals for reporting Delphi studies [9-12]. One guideline has been published using the acronym CREDES (Guidance on Conducting and REporting DElphi Studies) [9]. Another has been published using the keyword ACCORD (ACcurate COnsensus Reporting Document) [13,14]. Yet none of these reporting guidelines claims to be valid for the many diverse areas of application or Delphi variants in the health and social sciences. The study presented here aims to close this gap by developing the reporting guideline "DELPHISTAR: Delphi studies in social and health sciences–recommendations for an interdisciplinary standardized reporting."

Characteristics and variants of Delphi techniques
Delphi techniques are structured survey procedures in which complex topics, on which uncertain or incomplete knowledge exists, are evaluated by experts in an iterative process [15]. Specific to a Delphi procedure is that the survey is repeated and, from the second survey round onwards, information is shared regarding the results of the previous round, enabling the respondents to reconsider their judgments and, if needed, revise them. Five typical characteristics of the Delphi process can be gleaned from the methods literature [7,16]:
1. Experts are surveyed while typically preserving their anonymity.
2. The survey is conducted in at least two Delphi rounds.
3. A standardized questionnaire is used, often with open-ended questions to gather arguments and capture the horizons of legitimation.
4. The statistical analysis is based on descriptive calculations.
5. From the second Delphi round onwards, the experts receive feedback on the results of the previous round along with the questionnaire and can thus reconsider and, if necessary, revise their judgments.
Some authors define the Delphi process more narrowly and focus on the finding of consensus among the expert judgments [17,18]. According to Dalkey and Helmer [19], the process is suitable "to obtain the most reliable consensus of opinion of a group of experts ... by a series of intensive questionnaires interspersed with controlled feedback." Narrowing the definition to consensus, however, seems too restrictive given the many different settings in which Delphi studies are applied, for instance, to forecast future developments [3] or discover and aggregate knowledge [20].
In recent years many variations of the Delphi procedure have been developed [21,22]. More than 10 different variants have already been identified [23,24]. The Delphi variants differ from each other in terms of process design, for instance, whether the Delphi rounds are held separately or overlap with each other, in the weighting of open-ended and standardized responses, and also in regard to the expert panel, e.g., group size and the handling of anonymity [24,25]. Among the Delphi variants are both established variants and some that have hardly been used before:
• Real-time Delphi, in which expert judgments are reflected back online and in real time.
• Delphi markets, where the Delphi concept is combined with virtual marketing platforms (prediction markets) and the findings of Big Data research to improve abilities to forecast the future and the quality on which such predictions are based [27].
• Policy Delphis are concerned with capturing dissent, meaning a wide range of diverse judgments [16,28].
• Argumentative Delphi, where the focus is on the qualitative reasoning for the experts' quantitative evaluations [23].
• Group Delphi, for which the experts are invited to a workshop to openly formulate and discuss arguments in favor of divergent judgments [29,30].
• Deliberative Delphi (citizens' Delphi), in which citizens are surveyed iteratively. In between the Delphi rounds, they are trained to make informed and responsible judgments [31].
• Fuzzy Delphi applies different analytical strategies to quantify the linguistic labels often used in the Likert scales to allow for potential differences in the understanding of these expressions when calculating mean values [32].
• Café Delphi, in which a smaller number of experts are surveyed in an informal, "café-like" atmosphere [33].
A look at the paper published by Mullen in 2003 [34] makes it clear that this list is far from complete. She identifies more than ten additional Delphi variants (e.g., Delphi conference, decision Delphi, Delphi forecast, ranking Delphi), but without defining them more closely or differentiating them from one another. Furthermore, different systematic reviews report on countless other modifications of Delphi procedures that can hardly be named or understood [9,35].
The differentiation between Delphi variants is accompanied by epistemological and methodological specifications regarding the classic Delphi design, which also affects the characteristics. Hence, the definition of "expert" is broadened to include not only people in certain professional positions or who have attained academic excellence, but also people with a specific kind of lifeworld experience, which then means that experts are not just members of certain professions, but also patients, patients' relatives, or users [36,37].
From an epistemological standpoint, newer Delphi studies are often based on constructivist assumptions and use not only standardized questionnaires, but also explorative instruments in the form of open central questions [38] or workshops [39]. Ensuring anonymity, however, remains a constant in the evolution of the Delphi technique; the names of participating experts are published only in exceptional cases [30,40].
Given the often considerably limited scope of journal articles, it is sometimes impossible to present and justify the use of the selected Delphi variant and any modifications to it in a way that is sufficiently transparent to outsiders. In the following, a look at publication practices suggests, at the least, how these aspects are addressed.

Reporting Delphi studies
Different systematic reviews document unclear or potentially misleadingly formulated approaches in Delphi studies [41]. There are sometimes even errors in the presentation of the method or statistical analysis [42]. As an example, even a survey of experts in a single round is declared to be a Delphi study [43]. In respect to presenting the methodological approach, questions remain unanswered, for instance, regarding the form of feedback [44], why the selected number of rounds was chosen [45], at what point "consensus" was defined [46], and how high the response rate for each Delphi round was [47].
A recognized reporting guideline can help to counteract such methodological misunderstandings and imprecisions. Ultimately, the quality of Delphi studies can also be improved through more transparency. This is the aim pursued by the present study concerning the development of the reporting guideline "DELPHISTAR: Delphi studies in social and health sciences–recommendations for an interdisciplinary standardized reporting."

Background
The scientific network DEWISS has set the goal of developing a reporting guideline for Delphi studies that is valid for the different Delphi variants and diverse fields within the health and social sciences (more information is available at https://delphi.ph-gmuend.de/). The German-speaking DEWISS Network is comprised of 20 scientists and academics from different subject areas and disciplines. All of the members conduct Delphi studies in the context of their research and grapple with the methodological and epistemological aspects of Delphi techniques. They perform methodological tests, carry out surveys to improve the methodological basis of Delphi studies, advise other researchers on how to conduct Delphi studies, and develop concepts and materials that can be used to teach about Delphi procedures (e.g., short videos at https://delphi.ph-gmuend.de/). Since its founding, this network has received funding from the German Research Foundation (Deutsche Forschungsgemeinschaft, DFG), an overarching institution providing support for science and research in the Federal Republic of Germany (project number 429572724, time period 2020 to 2024).
• First sub-study: In the first step, an overview of Delphi studies was created from a methodological standpoint [41]. A total of 16 previous reviews of Delphi studies were identified, systematically evaluated, and the results summarized in a map [41]. It was seen here that, among other things, there is a diversity of approaches and, in some instances, unexamined modifications to Delphi studies. The research team's awareness of the relevant aspects and the necessity for a reporting guideline was raised by these findings.
• Second sub-study: In a systematic review, ten earlier recommendations for reporting Delphi studies were identified, analyzed in terms of content, and examined for commonalities and differences [48]. In the course of this, it was seen, among other things, that these previous recommendations did not claim to have validity across disciplines or for different Delphi variants. The recommendations were often developed for a specific research area, e.g., palliative medicine [9] or medical education [49]. This is possibly the reason why the proposal published in the EQUATOR Network by Jünger et al. [9] did not result in any fundamental improvement in reporting practices [35].
• Third sub-study: The results gathered from the first two sub-studies were discussed in the DEWISS Network and transformed into a comprehensive reporting guideline for Delphi studies. Consensus among additional Delphi experts was reached on this reporting guideline by means of a Delphi procedure. The selection of the Delphi method is justified by the fact that it is also recommended by other authors for the development of a reporting guideline [50]. The Delphi process is presented in the following.

The Delphi process
International experts on Delphi procedures were surveyed for the purpose of developing a reporting guideline for Delphi studies. The aim was to find consensus on the reporting criteria.
The approach was based on the "classic" Delphi technique with three rounds that were carried out online (Fig 2). Digital collection of data is now an established part of Delphi procedures [25]. However, since our process exhibits the five typical characteristics of a Delphi procedure (see Introduction), we identify our study as a "classic" Delphi. In doing so, we allot a relatively high importance to the free-text responses, in that we analyze them systematically, combine them with the quantitative data, and use them to fine-tune the wording in the reporting guideline.

Questionnaire development
The questionnaire was developed by the DEWISS Network on the basis of the first two sub-studies [41,48]. These sub-studies identified existing reporting guidelines and research methods, and the findings were synthesized during several DEWISS Network meetings (Table 1).
The results were incorporated in the first draft of the reporting guideline for Delphi studies.
For this, we selected a structured sequence organized by topics and sections because this resembles established reporting guidelines, particularly the PRISMA guideline for systematic reviews [51]. Finally, items covering five specific topics, each with up to seven sections, were contained in the initial questionnaire (Table 1). They are presented here as they appear in the final version of the reporting guideline.
The proposed content of the reporting guideline was queried in the form of standardized items on a 7-point rating scale ("1 = very unimportant, 7 = very important" or "1 = very unlikely, 7 = very likely") (Fig 3). Different rating scale widths have been established in Delphi studies [9,52]. Firstly, rating scales enable a separate evaluation of each item; secondly, an experimental study shows that for those taking the survey, the completion time is shorter and the cognitive effort is lower when compared to ranking scales [53]. This is an important argument in regard to participant motivation. With this in mind, we deliberately chose an odd-numbered scale width. Taze et al. [54], to cite one example, also recommend this for Delphi studies. The items were deliberately formulated so that it was possible to understand them without further explanation. Even so, examples were still included in some instances. Each item was programmed as a required question. For this reason, there was always an opt-out option available ("cannot evaluate this item").
Also, in all three of the rounds the experts were asked in a standardized manner about the certainty of their judgment ("1 = extremely uncertain, 7 = absolutely certain") so that this could be taken into consideration in the analysis. In the first and second Delphi rounds it was possible to comment freely after each topic (see Fig 3). The free-text boxes were each limited to 300 characters. In the third and final survey round it was possible to comment freely at the end of the survey without any limitations on the character count.
Table 1. Reporting guideline. Overview of the items that were evaluated according to topic and section.
Also integrated into the survey were questions about the respondents' expertise (discipline, country, experience with Delphi studies, proficiency as a Delphi practitioner). These served to describe the sample.
The survey was conducted in English. The initial questionnaire, including the reporting guideline, was translated by a native English speaker and then reviewed for accuracy by methods experts at the Leibniz Institute for the Social Sciences (GESIS), a renowned German research institute in the empirical social sciences. In all three of the Delphi rounds experts were requested not to use any machine translation tools in order to avoid any distortions as a result of translation errors.
The comprehensibility of the questions and the technical functioning of the online survey were tested prior to each Delphi round by DEWISS Network members who had not directly collaborated in the questionnaire development.

Selecting the experts
Considered as experts were academics who had conducted several Delphi studies themselves and/or who were working on methodological issues related to the Delphi technique. These experts were identified via publications. A search was conducted of two databases compiled by the DEWISS Network and freely accessible through ZOTERO [48]. The first database contains Delphi primary studies (available at: https://www.zotero.org/groups/4396781/dewiss_datenbanken_delphi-studien/collections/25H44TFI), and the second has publications based on the methodology of Delphi studies (e.g., reviews, methods experiments; available at: https://www.zotero.org/groups/4396781/dewiss_datenbanken_delphi-studien/collections/NGTBI3PE). Both databases were created in 2021 based on systematic research of the literature in the central databases for health and social sciences (Scopus, MEDLINE via PubMed, CINAHL and Epistemonikos) and contain Delphi studies and methods papers published between 2016 and 2021. The search was conducted using the keyword "delphi*" in the title or abstract. Publications were included if they involved methodological publications regarding Delphi studies or Delphi primary studies in the health or social sciences. The collection of methods-based studies includes 155 papers and the one with Delphi primary studies comprises 7,044 papers [48]. Authors who had published at least five papers (n = 863) were extracted from the primary study collection. All lead and senior authors (n = 228) were extracted from the database containing the methods studies. Nineteen authors were present in both databases so that, in the end, 1,072 Delphi experts were identified and invited to participate in the Delphi study. The author information listed in the publications was used as the contact information. The sample contained 352 women and 710 men (10 unclear) from 47 countries (top five: USA, England/UK, Australia, Canada, Italy).
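The size of the invited panel follows directly from the counts reported above. The following minimal Python sketch retraces that arithmetic; it uses only figures from the text, and the variable names are illustrative assumptions.

```python
# Counts reported in the text; variable names are illustrative.
primary_authors = 863   # authors with at least five Delphi primary studies
methods_authors = 228   # lead and senior authors of the methods papers
in_both = 19            # authors present in both databases

# Count authors who appear in both databases only once.
invited = primary_authors + methods_authors - in_both
print(invited)

# The reported gender breakdown of the sample sums to the same total.
women, men, unclear = 352, 710, 10
print(women + men + unclear == invited)
```

Subtracting the 19 overlapping authors yields the 1,072 invited experts, which also matches the sum of the reported gender categories.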
Participation in the Delphi study was voluntary and anonymous. Informed consent was obtained from all of the participants at the beginning of the survey using an online form. The study design complies with the Helsinki Declaration [55], with regard for the European General Data Protection Regulation [56] and the principles of the DFG [57].

Data collection
The programming and sending of the questionnaire was done using Unipark software [58]. The invitation email contained a personalized link to the questionnaire and a PDF attachment with the contents of the reporting guideline that were to be evaluated. The time period for the survey was always a minimum of four weeks, during which two to three reminders to participate in the Delphi study were sent (Fig 2). Along with each survey questionnaire, the experts also received a PDF of the preliminary reporting guideline. Each time it was made clear which items had been agreed on, which items had been reworded, and if any new items had been added.
First Delphi round. All of the identified experts (n = 1,072) were invited by email to participate in the first Delphi round. Due to security rules at some institutions, some of the emails were blocked, which is why only 87% (n = 934/1,072) of the emails were deliverable.
Second Delphi round. The initial questionnaire was revised based on the results of the first Delphi round, meaning that items on which consensus had been reached were removed and the remaining items were reworded as necessary based on the free-text comments. The changes in wording were highlighted in color so that the experts could see and understand them. The revisions served to fine-tune the semantics and validate the changes by passing them back to the surveyed experts [59]. This approach is often described in "classic" Delphi studies [60,61].
The experts received feedback on the statistical group response (aggregated percent agreement on the scale points 6+7, mean value, standard deviation) from the previous round and a summary of the arguments made in the open-ended responses. In addition, the experts were able to see their own responses to the standardized items from the previous round. Furthermore, the definition of consensus was also communicated to the experts.
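The statistical group response described above can be illustrated with a short Python sketch. It is not the authors' implementation (the study used R); the function name `group_feedback`, the example ratings, the use of the sample standard deviation, and the exclusion of "cannot evaluate this item" answers (coded here as `None`) are all assumptions made for illustration.

```python
from statistics import mean, stdev

def group_feedback(ratings, own_previous=None):
    """Summarize one item's 7-point ratings from the previous round.

    `None` stands for the opt-out answer "cannot evaluate this item" and is
    left out of all calculations (an assumption, not stated in the text).
    At least two valid ratings are needed for the standard deviation.
    """
    valid = [r for r in ratings if r is not None]
    agree = sum(1 for r in valid if r >= 6)  # scale points 6 and 7
    return {
        "n": len(valid),
        "pct_agreement_6_7": round(100 * agree / len(valid), 1),
        "mean": round(mean(valid), 2),
        "sd": round(stdev(valid), 2),        # sample standard deviation (assumed)
        "own_previous": own_previous,        # the expert's own earlier rating
    }

# Hypothetical ratings for a single item, including one opt-out answer.
example = [7, 6, 6, 5, 7, 4, None, 6, 7, 3]
summary = group_feedback(example, own_previous=5)
print(summary)
```

Each expert would see the same group values per item, with only `own_previous` differing between respondents.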
Experts who had completed the first round were contacted one week before the second Delphi round to inform them about it and ask them to participate again.
Third Delphi round. The questionnaire was revised anew and shortened based on the results of the second Delphi round. Shortening the questionnaire was also undertaken as a measure to maintain the participants' motivation.
As feedback, experts received the statistical group response from the previous round and again were able to see their own responses to the standardized items. Since there were only a few new arguments in the open-ended responses and these had been integrated into the questionnaire as part of the revision process, no summary of the arguments made in the open-ended responses was included with the questionnaire at this point in the process. Changes in the wording were, however, again made visible using color highlighting.

Data analysis
Statistical analysis was performed using R [62]. The responses to the standardized questions were descriptively analyzed (absolute and relative frequencies, minimum, maximum, mean, median, standard deviation). Consensus was defined a priori as follows: Consensus for the inclusion of an item in the reporting guideline exists if at least 75% of the responses assign the scale values of 6 or 7 (very important) on the 7-point rating scale. From the second round onward, all items with a rejection rate of at least 50% were excluded, meaning that less than half of the responses assigned the scale values of 6 or 7 on the 7-point rating scale. Items for which consensus had already been reached were not presented again for evaluation in the subsequent rounds.
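The decision rules above can be sketched in a few lines of Python. This is an illustrative reading, not the authors' R code: the names `pct_top_two` and `classify` are hypothetical, opt-out answers (coded as `None`) are assumed to be excluded from the denominator, and the exclusion rule follows the wording "less than half of the responses assigned the scale values of 6 or 7".

```python
CONSENSUS = 75.0  # a priori threshold for inclusion, in percent (from the text)
REJECTION = 50.0  # below this share of 6/7 ratings an item is dropped (round 2 onward)

def pct_top_two(ratings):
    """Percentage of valid ratings that are 6 or 7 on the 7-point scale."""
    valid = [r for r in ratings if r is not None]  # None = opt-out (assumed excluded)
    return 100 * sum(1 for r in valid if r >= 6) / len(valid)

def classify(ratings, round_no):
    """Return the status of one item after a given Delphi round."""
    pct = pct_top_two(ratings)
    if pct >= CONSENSUS:
        return "consensus"    # included; not presented again
    if round_no >= 2 and pct < REJECTION:
        return "excluded"     # rejection rule applies from round 2 onward
    return "re-evaluate"      # presented again in the next round

# Hypothetical ratings: 4 of 10 experts choose 6 or 7.
weak_item = [6] * 4 + [2] * 6
print(classify(weak_item, round_no=1))
print(classify(weak_item, round_no=2))
```

Under these assumptions the same weak item is carried forward after the first round but dropped after the second, which mirrors the staged exclusion described in the text.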
Analysis of the open-ended responses from the text boxes was done using the Argument-based QUalitative Analysis strategy (AQUA) [63] with Microsoft Word (2019). The AQUA method is based on established analytical methods in qualitative social research and was developed further for the analysis of qualitative data from Delphi studies. When applying the AQUA method, arguments from the open-ended responses are extracted and categorized by topic [63]. No quantification regarding frequency of mentions was undertaken. The arguments in each Delphi round were discussed in the DEWISS Network and, if needed, used to reword the items on the questionnaire.

Ethical approval
The ethics commission at the University of Education Schwäbisch Gmünd granted written approval on 10 July 2023, rendering an ethics vote unnecessary.

Results
Of the 934 experts invited to the first Delphi round, 91 (10%) completed the survey. The second Delphi round had a response rate of 76% (n = 69/91), the third had a response rate of 81% (n = 56/69). Overall, experts hailed from 22 countries (round 1), 20 countries (round 2) and 19 countries (round 3), with about half of the experts working in one of five countries: USA, UK, Canada, Australia and China. The distribution in terms of region and discipline remained comparable for all rounds (Table 2). Between 87% and 89% of the experts in each of the rounds stated that they were associated with the health sciences; the others belonged more to the social sciences (Table 2). The central tendency involving publications by the experts is similar across all of the rounds. The number of Delphi studies personally conducted by the participating experts is on average clearly lower in the first Delphi round than in the two subsequent rounds. The results of the self-assessed expert profile and response behavior show only minor fluctuations in the relative frequencies for the rounds (Table 2). The majority of the experts judged their ability to apply classic Delphi techniques as excellent (scale points 6+7 out of 7), whereas less than 50% assessed their abilities to be excellent in regard to the real-time Delphi, group Delphi and policy Delphi. For the other Delphi variants, only 5% or fewer of the experts judged their competence to be high.
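The response rates reported at the start of this section can be retraced with a brief Python sketch. It uses only the counts given in the text; the variable names are illustrative, and each round's rate is assumed to be relative to the number of experts invited to that round.

```python
# Counts reported in the text; variable names are illustrative.
invited_round_1 = 934               # deliverable invitations in round 1
completed = {1: 91, 2: 69, 3: 56}   # experts completing each round

rates = {}
base = invited_round_1
for round_no, n in completed.items():
    rates[round_no] = 100 * n / base  # share of those invited to this round
    print(f"Round {round_no}: {n}/{base} = {rates[round_no]:.0f}%")
    base = n                          # completers form the next round's invitees
```

Rounded to whole percentages, the three values are 10%, 76% and 81%, in line with the figures stated above.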
All of the judgments were included in the analysis, and the statements on judgment certainty were taken into account when analyzing the items for content and revising the questionnaire because, in all of the Delphi rounds and for all of the topics, the experts on average (median 6) responded with good levels of judgment certainty and the variance among the responses was low (standard deviation ≤ 1.2).
Table 2 note. Ability in Delphi variants was rated on a scale from 1 = absolutely no ability to 7 = excellent ability; reported as scale value 6+7 in %, mean (sd).
In total, 65 items were presented for evaluation regarding the reporting guideline. At the end of the three Delphi rounds consensus was found for the inclusion of 38 items in the reporting guideline for Delphi studies in the health and social sciences (S1 File). The points of agreement and disagreement are discussed below.

Topic: Title and abstract
Consensus was reached for all of the items asked about the topic of Title and Abstract. The majority of the experts said it is important that Delphi studies can be identified through their titles and abstracts and that the abstract's content should be structured (Table 3).

Topic: Context
The topic of Context was covered in three sections: formal, theory and content. For the section on formal aspects, it was possible to reach agreement on five items (Table 4). According to the experts' opinions, information about funding sources, author team, methods consulting, project background, and the study protocol are important topics for a Delphi reporting guideline. Dissent exists on whether information about the time point of a Delphi study, an ethics vote, or additional information about project background need to be reported. In terms of an ethics vote, it is "typically not required to perform a Delphi in health sciences, since it does not involve human subjects" (free-text comment in the second Delphi round). The experts did not agree to include any item from the section on theory in the reporting guideline (Table 4). In regard to the item about research paradigm, the free-text responses displayed opposing patterns of argument. Several of the respondents viewed Delphi studies as belonging to the quantitative paradigm ("A qualitative questionnaire is qualitative research, not Delphi"; commentary from the first Delphi round). For these experts, Delphi judgments have a universal and evidence-based character. Other respondents assigned Delphi studies to the qualitative paradigm ("A Delphi study has the aim to communicate and have a discussion, it is qualitative research"; commentary from the second Delphi round). This latter group emphasizes the relevance of open-ended questions in Delphi procedures, e.g., to gather context for specific judgments.
In the section covering content, justifying the selected method and stating the aim of a Delphi study are central elements of reporting (Table 4). What is not necessary, according to the respondents, is reporting within the context of current social developments. Disagreement remains about the items on making the relevance of a study clear. The argument against this is a pragmatic one, namely that a reporting guideline cannot cover all conceivable aspects.

Topic: Method
The topic of Method was divided into seven sections: body & integration of knowledge, Delphi variations, sample of experts, survey, Delphi rounds, feedback, and data analysis. Consensus was found for reporting on all three of the items asked about in the section on the body & integration of knowledge (Table 5). Accordingly, the identification of relevant expertise, the handling of missing knowledge, and an explanation of who is considered an expert in a particular Delphi study are considered important aspects when reporting a Delphi study.
In the section addressing Delphi variations, the experts agreed that it is important to identify and justify the Delphi variants and any modifications (Table 5).
In the section on the sample of experts, the selection criteria, how experts were found, and information about the recruitment process must be described (Table 5). How anonymity was handled was not viewed as relevant by the experts. The arguments in the free-text comments for disclosing respondents' identities included a better understanding of the judgments; the counterargument posed the question whether the relevant people would still participate in that case. Dissent remained concerning the relevance of reporting dropouts.
Eleven items were proposed in the section on survey, for which agreement on two items was reached (Table 5). The experts considered a general description of the questionnaire's development and the survey process to be relevant. What was found irrelevant or remained in dissent were, among other things, items regarding the pretest of the questionnaire and naming the software used.
In the sections about Delphi rounds and feedback, the experts agreed on reporting the number of rounds, identifying the objectives of each Delphi round, defining a termination criterion, and giving a detailed description of the feedback's design, including if group-specific analysis should be made available or, if applicable, how dissent was handled (Table 5).
In the section covering data analysis, it was agreed that the analytical methods applied to quantitative and qualitative data, the definition of consensus, and information regarding subgroup analysis or the weighting of the expert groups must be reported (Table 5). The percentage agreement for reporting the software used for analysis lies below the defined value for consensus.

Topic: Results
The topic involving Results contained the two sections on Delphi process and results. In the section on Delphi process there is consensus that the process, the number of experts per Delphi round, and any unexpected events during the Delphi process must all be reported (Table 6). Not included in the consensus are the reporting of sociodemographic characteristics and information about the experts' competency. Emerging from the free-text comments is the observation that it is difficult to define competence.
In the section focused on results the experts argued for presenting the results of each round (Table 6).

Topic: Discussion and dissemination
The topic of Discussion and Dissemination was subdivided into the two sections on quality of findings and dissemination. Belonging to the section on quality of findings is the reporting of a study's results, the validity and reliability of the findings, and possible limitations of a Delphi study (Table 7). With 74%, the agreement on the external validity of the results lies just under the cut-off value which requires 75% agreement.
No items from the section on dissemination will be included (Table 7).
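
The consensus rule applied throughout the Results (at least 75% of respondents assigning scale points 6 or 7 on the 7-point Likert scale) can be illustrated with a minimal sketch. The function names and the sample ratings are hypothetical and are not data from the study; taking the percentage over all supplied ratings is likewise an assumption.

```python
from typing import Sequence

CONSENSUS_THRESHOLD = 75.0  # per cent, the cut-off defined for this Delphi study
AGREEMENT_POINTS = (6, 7)   # scale points counted as agreement on the 7-point scale


def percent_agreement(ratings: Sequence[int]) -> float:
    """Return the share of ratings (in per cent) on scale points 6 or 7."""
    if not ratings:
        raise ValueError("no ratings supplied")
    agreeing = sum(1 for r in ratings if r in AGREEMENT_POINTS)
    return 100.0 * agreeing / len(ratings)


def reaches_consensus(ratings: Sequence[int]) -> bool:
    """True if the item meets the 75% consensus criterion."""
    return percent_agreement(ratings) >= CONSENSUS_THRESHOLD


# Hypothetical ratings from eight respondents for a single item.
example_ratings = [7, 6, 6, 7, 5, 7, 6, 4]
print(percent_agreement(example_ratings), reaches_consensus(example_ratings))
```

Under this rule an item at 74% agreement, such as the one on external validity, would not be included.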

Discussion
The proposed reporting guideline for Delphi studies in the health and social sciences encompasses a total of 38 items that have been agreed upon by an international expert panel of Delphi practitioners. Because experts from different subject areas and with a broad range of Delphi knowledge were included, we assume that the DELPHISTAR Reporting Guideline will be received very well by the scientific community. It is comparable in its scope to established guidelines, e.g., CONSORT [64] (37 items) and PRISMA [51] (42 items). The requirement of 75% for a consensus resulted in the exclusion of several items that in some cases only very narrowly failed to meet this criterion. In future discussions regarding the reporting guideline, it would be worth considering the possible inclusion of these items as "desirable" based on some type of grading system [65]. Ten items (e.g., external validity, information about expert competency) achieved a consensus ranging from more than 60% up to 74% in the third Delphi round. A consensus ranging between 50% and 59% was reached in the third round for five items (e.g., information about the software used for analysis, information about the validity of the items/scales).
First and foremost, we expect an improvement in the reporting of Delphi studies. The potential for this is demonstrated by analyses of existing reporting guidelines; for instance, studies evaluating the Consolidated Standards of Reporting Trials (CONSORT) checklists show that the use of the reporting guideline is associated with improved reporting of randomized controlled trials [66,67]. We also expect to see a simplification or harmonization of the review process for Delphi studies and a raised awareness among Delphi practitioners about the quality of Delphi studies.
That said, the implementation of this recommended guideline is also contingent on whether journals require and check for the use of the guideline [67]. It is no less important for us, as the DEWISS Network, to promote DELPHISTAR, to familiarize the target fields with it, and to publish it in the EQUATOR Network. In terms of dissemination, we intend to create our own website, upload a short video via social media, and also inform the publishers of relevant journals and Delphi practitioners via email. Regarding this specific objective, the participating experts will be explicitly asked for their evaluation of the reporting guideline and their participation in the Delphi after the fact [52]. By doing this, we hope to gain information and insights concerning the quality of this Delphi study and future Delphi procedures.
Several items remained without agreement or did not meet the previously defined criterion for consensus, which could possibly be traced back to a lack of methods research. This is seen in regard to three aspects:
1. The agreement to exclude items involving theory is a sign of absent discussions about the theoretical positioning of Delphi studies. Nonetheless, this would still be important because the definition of an epistemological aim is directly connected with the selection of quality criteria for Delphi studies [68]. Delphi studies that are more qualitative must be measured against criteria such as transparency or intersubjective comprehensibility, whereas quantitative Delphi studies have more to do with criteria such as scale quality and reliability of the results [23]. Admittedly, no established criteria yet exist to evaluate the quality of Delphi studies, even though initial proposals are available [52,69].
2. The dissent around the items involving expert competency or scale validity could indicate that there is still too little methods research on this that investigates the potential influence of these aspects on judgement behavior and, thus, on the results [70].
3. Evaluations of Delphi studies could also provide new information. To date, such evaluations are carried out only in individual instances [71], but could yield important insights regarding the participants' motivations and judgment behaviors. This knowledge could also be relevant to further development of the Delphi reporting guideline.
We claim that DELPHISTAR can be used with different Delphi variants. Viewed from a quantitative perspective, it could be critically noted that most of the participating experts consider their expertise to be in the classic Delphi, real-time Delphi, policy Delphi and the group Delphi. This was to be expected because, despite the increasing differentiations and methodological modifications, these are the most frequently used Delphi variants [41,72]. Argued from a qualitative standpoint, we assume based on our sampling method that the individually surveyed experts have a very high level of proficiency in the Delphi techniques covered by the questionnaire. Despite this, we are not able to determine with certainty that the items in the reporting guideline can be applied to all of the variants and modifications to Delphi procedures. For this reason, we plan to take a further step and test this reporting guideline on a defined random sample of publications in order to ensure feasibility.

Strengths and limitations
The results of our Delphi survey must be viewed in the context of the expert panel and the survey time point in 2022. We assume that the use and applicability of DELPHISTAR must be subject to ongoing critical reflection. It is possible that items which were not included in the reporting guideline will be required by reviewers (e.g., "time period in which the Delphi study was conducted"). Furthermore, technical innovations, methodological developments and discussions regarding methods can affect Delphi studies, thus changing the criteria for reporting them (e.g., "information about the software used for analysis"). This suggests that discussions about the participation of affected persons in Delphi studies conducted in clinical or nursing contexts will become increasingly important, very possibly making methodological modifications to Delphi techniques necessary [73,74]. Information regarding ethical approval would become much more important as a consequence.
In the Delphi study presented here, the response rate of approximately 10% is typical for international online Delphi studies [75]. Reasons why experts did not participate could involve language barriers or not receiving the emails. Using private email addresses for this would be conceivable, as several authors recommend [76]. It is possible that the regular reminder may have been effective in encouraging participation in all three Delphi rounds, in that, among other things, the actual completion time (average time for the experts participating up to that point) was included in the feedback.
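
The participation figures reported for this study (1,072 experts invited; 91, 69 and 56 experts completing rounds one to three) allow the response and retention rates to be recomputed. The following is a minimal sketch; the variable names are illustrative, and using first-round completers divided by invited experts as the response rate is an assumption, since the exact denominator behind the "approximately 10%" is not specified here.

```python
# Participation figures as reported for this Delphi study.
invited = 1072
completed = {1: 91, 2: 69, 3: 56}  # Delphi round -> experts who completed it


def pct(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Assumption: response rate = first-round completers / invited experts.
response_rate = pct(completed[1], invited)

# Retention in each round relative to the first-round completers.
retention = {rnd: pct(n, completed[1]) for rnd, n in completed.items()}

print(f"Response rate: {response_rate}%")
for rnd, share in retention.items():
    print(f"Round {rnd}: {share}% of first-round completers")
```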
The expert panel's geographic heterogeneity was successfully maintained. Nevertheless, biases in the panel could be present due to the predominance of experts with a background in the health sciences. Furthermore, only Delphi experts who published between 2016 and 2021 were included. It is possible that, as a consequence, specialists who also possess a high level of expertise and an impressive publication history in this field were excluded.
A relatively strict consensus criterion of 75% was selected for this Delphi study, which results in items being either kept or rejected. The results could instead have been divided into different categories, for example, into three categories with a) items of highly consensual and necessary inclusion (e.g., 75% and above), b) items of desirable and generally necessary inclusion (e.g., between 60% and 75%), and c) possible items of inclusion depending on the study and study objectives (less than 60%). Following this strategy may very well have produced a more differentiated yet more complex reporting guideline.
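
The three-category alternative described above can be sketched as a simple classification. This is only an illustration of the example cut-offs named in the text, not a scheme that was applied in the study; the function name and the handling of a value of exactly 60% (assigned to category b) are assumptions.

```python
def reporting_tier(percent_agreement: float) -> str:
    """Assign an item to category a, b or c using the example cut-offs.

    a: 75% and above      (highly consensual, necessary inclusion)
    b: 60% to below 75%   (desirable, generally necessary inclusion)
    c: below 60%          (possible inclusion, depending on the study)
    """
    if not 0.0 <= percent_agreement <= 100.0:
        raise ValueError("percent_agreement must lie between 0 and 100")
    if percent_agreement >= 75.0:
        return "a"
    if percent_agreement >= 60.0:  # assumption: exactly 60% counts as category b
        return "b"
    return "c"


# The item on external validity reached 74% agreement in this study.
print(reporting_tier(74.0))
```

Under such a scheme the item on external validity would fall into category b rather than being rejected.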

Fig 3. An example of a page from the questionnaire on judgment certainty and a text box for comments (Source: Unipark).
https://doi.org/10.1371/journal.pone.0304651.g003

Table 4. Results for the topic of Context.
Information if the Delphi study is combined with another study (e.g., systematic review to develop the questionnaire, focus group with patients to discuss the Delphi results)
https://doi.org/10.1371/journal.pone.0304651.t004

We use the term "questionnaire" for the survey instrument regardless of whether quantitative or qualitative items are integrated or weighted.
https://doi.org/10.1371/journal.pone.0304651.t005